ABSTRACT

There are two distinct but related purposes for
carrying out a "discriminant analysis": (1) discrimination, and (2)
classification. The primary objective of this paper was to review the
outputs of selected computer programs often used to carry out a
"discriminant analysis" with respect to these two purposes.
Information provided by the programs on requisite data conditions for
each type of analysis is discussed. The evidence indicates that to
say one has carried out a "discriminant analysis" when using any of
the selected programs would be misleading. The information obtained
from any of the programs is quite inadequate for either of the two
purposes mentioned above. (Author)

ABSTRACT

The primary objective of this paper is to review the outputs
of selected computer programs often used to carry out a "discriminant
analysis" with respect to two purposes of such analysis: (1) discrimination,
and (2) classification. The programs selected are the three BMD programs.
Information provided by the programs in terms of requisite data conditions
for each type of analysis is discussed. It is concluded that to say
one has carried out a "discriminant analysis" when using any of the
selected programs would be misleading, indeed. The information yielded
directly by any of the programs is quite inadequate for either of the
two purposes mentioned above. The obtaining of supplemental statistics
is indicated.



Use of Some "Discriminant Analysis"
Computer Programs 

Introduction 

Multivariate statistical theory is by no means new. However,
applications of many aspects of the theory in educational research have
only become fairly commonplace in the past decade or so. Interest in the
complicated (in the sense of calculations, at least) multivariate procedures
has certainly been enhanced by the adaptation of high-speed computers to
problems of data analysis. Except for very "small" sets of data, there
has been almost a total reliance on computers by educational researchers
to carry out the necessary calculations. In some cases of multivariate
data analysis, problems have arisen out of the widespread use of computer
programs. It must be noted that the problems are usually not inherent in
the programs themselves, but rather in how they are used, although
insufficient program documentation sometimes causes difficulty in use.
Oftentimes, lack of statistical training and/or experience in data
analysis contributes to the misuse of computer programs, including
misinterpretation of computer output. Problems with, and misuse of, computer
programs have often appeared in two classes of multivariate methods: factor
analysis and discriminant analysis. The concern in this paper is with the
latter of these two general and often confusing domains of study.

Discriminant Analysis

The term "discriminant analysis" has come to mean different things to
different people. The original proposed use of the "linear discriminant
function" was to classify an object into one of two groups to which it
must belong (Fisher, 1936). This classification is made using measures
on a number of (intercorrelated) variables for each object involved.

Even with more than two criterion groups, "discriminant
analysis" in educational applications has, in the past, most generally
implied some type of classification or assignment of individuals. However,
recently the term has taken on extended meaning; that is, the term may
imply data analysis techniques other than mere classification. Suppose
we are given the existence of g well-defined populations and a sample
(or group) of individuals from each population with p measures for each
individual. Methods used to analyze such data may be dictated by two
purposes of the analysis: (1) to study group separation in terms of
variable contribution and in terms of dimensions of separation
(discrimination), and (2) to set up a rule, based on the p-variate data,
which will enable us to assign some new individual to the correct
population when it is not known from which of the g populations he
emanates (classification). It may be added that two other purposes might
be considered: (3) to determine if the g populations are statistically
significantly separated (separation), and (4) to estimate distances
between pairs of populations (estimation). It may be argued that
separation, a la multivariate analysis of variance (MANOVA), is
necessarily considered prior to discrimination. See Huberty (1974) for a
more complete discussion of these four aspects of discriminant analysis.

Studies designed with either of the first two purposes in view are
scattered throughout the educational research literature. Discrimination
analyses have recently been employed by Goldman and Warren (1973), Nicholson
(1973), Whellams (1973), Bausell and Magoon (1972), and Rock, Baird, and
Linn (1972). Classification was the primary analysis used by Keenen
and Holmes (1970), Stahmann (1969), and Ghastian (1969). It must be
recognized that not all studies which might be included in the latter
category employ a classification analysis for the purpose mentioned in
the previous paragraph. Rather, the individuals being classified are
those whose measures were used in determining the classification rule
applied. More will be said on this later.

Requisite Conditions for Discriminant Analysis 
A wealth of research has been reported where the effects of failing to 
meet requisite conditions for univariate parametric statistical methods 
have been studied. The conditions usually considered in these studies are 
those of population normality and homogeneity of variance. In the 
univariate case, very substantial departures from normality and/or
homogeneity do not seem to affect many tests, at least in some senses.
It is not at all clear that this holds in multivariate tests; relatively 
little empirical research has been done in this area. 

A "discriminant analysis" in the sense of discrimination and
classification problems may be carried out without directly incorporating
significance tests. [However, some methodologists might contend that either
of these two problems ought only be considered after a simple MANOVA yields
significance.] The conditions of p-variate normality and equality of
the g population (p x p) covariance matrices are often assumed to be met
(in some respects needlessly) in many discriminant analyses. Of course,
no such assumptions need be made in arriving at the sets of discriminant
function coefficients through the usual eigenanalysis. The sets of
coefficients are the eigenvectors associated with the eigenvalues of
the matrix product $W^{-1}B$, where W and B are the (p x p) pooled
within-groups and between-groups deviation score cross-products matrices,
respectively. It might be argued that such pooling only makes sense when
the population covariance matrices are identical; it is noted that Porebski
(1966, p. 228) debates the need for carrying out a preliminary test for
identical population covariance matrices. In discrimination, the p-variate
normality condition is only needed if one desires, or feels compelled, to
test the discriminant functions for significance.

In classification applications p-variate normality is not a requirement;
it is only necessary that the population density functions be known
(Melton, 1963). However, most of the distribution-based formulations
developed by mathematical statisticians for classification purposes are
built on multivariate normal densities. [Limited developments have been
made which are distribution-free in nature (see Kendall, 1966).] The
inequality of the covariance matrices presents no problem in multivariate
classification. In fact, differences in variances and covariances can be
very useful in improving classification accuracy. This is particularly
true when there is considerable overlap among the groups. An added
assumption that is often made in a classification analysis in educational
research is that costs of misclassifying individuals associated with
each of the g groups are identical. The situation of unequal costs can
be easily handled in the computations.

Computer Programs

The primary purpose of this paper is to review selected computer
programs in terms of uses for two purposes of a discriminant analysis,
discrimination and classification. There exists today a variety of computer
programs available to the educational researcher. There are a few very
general multivariate programs (e.g., those by Elliot Cramer and by Jeremy
Finn) that are available to users. One or more of a number of statistical
computer "packages" are readily accessible at most institutions;
BMD, OSIRIS, SAS, and SPSS are popular packages. IBM distributes a
Scientific Subroutine Package (SSP) which includes a program designed to
compute discriminant "functions." There are some books that list a
number of computer programs (e.g., Veldman, 1967; Cooley and Lohnes, 1971;
Overall and Klett, 1972). A book devoted exclusively to "discriminant
analysis" by Eisenbeis and Avery (1972) also offers a set of computer
programs. Individual computer programs are also available from writers;
references are found in such journals as Educational and Psychological
Measurement and Behavioral Science.

The discriminant analysis programs emphasized in this paper are those
found in the widely used BMD package (Dixon, 1973); these are the 4M,
5M, and 7M programs. The titles given to these programs are: 4M,
Discriminant Analysis for Two Groups; 5M, Discriminant Analysis for Several
Groups; and 7M, Stepwise Discriminant Analysis. Because of the
relationship between discriminant analysis in the two-group case and multiple
regression analysis, the BMD 2R program, Stepwise Regression, will be
included in the discussion. The three discriminant analysis programs will
be reviewed individually, as well as relationships among these three and
the regression program.

Discriminant Analysis for Two Groups

Beyond the basic computational results, the output from the 4M
program includes the (unstandardized) discriminant function coefficients,
a measure of distance (Mahalanobis' $D^2$) between the two criterion
groups (i.e., between the two group mean vectors, or centroids), the mean
on the discriminant function for each group, and the discriminant function
values for each individual (or case) in each group, printed in order of
numerical value. [A value of an F-statistic, which is a transformation
of the $D^2$ value, is also given, which may be used to satisfy the third
purpose of an analysis, separation, mentioned in an earlier section of
this paper. In using this test one must assume p-variate normality.]
A word about the discriminant function determined: computationally,
the coefficients are not found via the eigenanalysis so often associated
with discriminant functions (Cooley and Lohnes, 1971, p. 246). However,
the results are equivalent in the sense that the sets of coefficients
obtained from the two analyses would be proportional.
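
The proportionality just noted can be checked numerically. The following is a minimal sketch, assuming two variables and made-up values for W, the mean-difference vector d, and the group sizes (none of these come from a BMD run). With two groups, B equals c d d' with c = n1 n2 / (n1 + n2), so the product W^{-1}B has a single nonzero eigenvalue, and the direct solution a = W^{-1}d is an eigenvector for it.

```python
# Hypothetical 2-variable, 2-group example; all numbers are illustrative.
n1, n2 = 12, 8
W = [[40.0, 10.0],
     [10.0, 30.0]]   # pooled within-groups cross-products matrix
d = [2.0, -1.0]      # difference between the two group mean vectors


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matvec(m, v):
    """Matrix times vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def matmul(a, b):
    """Matrix times matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


c = n1 * n2 / (n1 + n2)
B = [[c * d[i] * d[j] for j in range(2)] for i in range(2)]  # between-groups matrix

W_inv = inv2(W)
a = matvec(W_inv, d)                           # direct two-group coefficients
lam = c * sum(d[i] * a[i] for i in range(2))   # the nonzero eigenvalue, c * d' W^-1 d
lhs = matvec(matmul(W_inv, B), a)              # (W^-1 B) a
rhs = [lam * x for x in a]                     # lambda * a
```

Here `lhs` and `rhs` agree, so any eigenvector reported by an eigenanalysis program for the nonzero eigenvalue can differ from `a` only by a constant multiplier.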

If a purpose of the analysis is discrimination, as described
earlier, little information is provided. If the F-value yields
significance, then one may conclude there is one significant dimension of
separation, this being represented by the determined discriminant function.
No direct information is provided to (1) assess the contribution of each
variable to the overall separation (which might be done by examining
standardized coefficients), or (2) aid in interpreting the discriminant
function (where the variable versus discriminant function correlations
might be used). With some arithmetic manipulation, however, this
information may be obtained. To get the ith standardized coefficient,
one can multiply the reported coefficient, $a_i$, by the (positive) square
root of the ith diagonal element of the printed "SUM OF PRODUCTS OF DEV.
FROM MEANS" (p x p) matrix; this matrix was denoted by W in the last section.
The variable versus discriminant function correlations could be calculated
from the information reported, but the computations would be fairly
extensive, since they involve matrix products. If one is merely interested
in the ordering of the variables that would be determined by these
correlations, only a simple set of calculations need be performed. It
has been shown that this ordering is identical to that yielded by the
ordering of the p univariate ANOVA F-values (Huberty, 1972). To determine
the F-value for the ith variable the following expression is used:



$$F_i = \frac{d_i^2 \, n_1 n_2}{MS_i \, (n_1 + n_2)}$$

where $d_i$ is the difference of means on the ith variable, $n_j$ (j = 1, 2) is
the jth sample size, and $MS_i$ is the error mean square for variable i.
The $d_i$- and $n_j$-values are reported, and $MS_i$ may be found by dividing the
ith diagonal element of W by $(n_1 + n_2 - 2)$.
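
The two hand calculations described above may be sketched as follows. This is only an illustration: the coefficient, diagonal, and mean-difference values are invented stand-ins for 4M output, and the function names are hypothetical.

```python
import math

# Illustrative values standing in for 4M output (not from a real run).
n1, n2 = 12, 8
a = [0.0636, -0.0545]   # reported (unstandardized) coefficients
w_diag = [40.0, 30.0]   # diagonal of "SUM OF PRODUCTS OF DEV. FROM MEANS" (W)
d = [2.0, -1.0]         # reported mean differences, one per variable


def standardized_coefficients(coefs, diag):
    """Multiply each coefficient by the square root of its diagonal element of W."""
    return [a_i * math.sqrt(w_ii) for a_i, w_ii in zip(coefs, diag)]


def univariate_f(diffs, diag, size1, size2):
    """Univariate F for each variable: d^2 n1 n2 / (MS (n1 + n2))."""
    df_error = size1 + size2 - 2
    values = []
    for d_i, w_ii in zip(diffs, diag):
        ms_i = w_ii / df_error  # error mean square for this variable
        values.append(d_i ** 2 * size1 * size2 / (ms_i * (size1 + size2)))
    return values


a_std = standardized_coefficients(a, w_diag)
f_values = univariate_f(d, w_diag, n1, n2)
```

With these made-up numbers the first variable has the larger F-value, so it would also rank first on the variable versus discriminant function correlations.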

Based on the output, the primary purpose behind the use of the 4M
program is necessarily that of classification. Even then, the only
classification that can be performed is that of the cases or individuals
on whom the classification statistics were based. That is, there are no
means of directly classifying "new" cases. Furthermore, the two sample
covariance matrices are pooled in arriving at the classification statistic,
which in this situation is merely the discriminant function. This implies
that the population covariance matrices are assumed identical, which would
make the use of the linear discriminant function quite appropriate. The
output does not provide sufficient information to determine whether or
not this condition is met. The mean discriminant function value is
reported for each group. Then, assuming equal costs of misclassification
and equal prior probabilities of group membership, classification (of
the cases already considered in determining the discriminant function) is
simple. Cases whose discriminant function values are closest to the function
mean of group j are assigned to group j. The sample proportion of
misclassifications may be found by a mere count. If one is interested in
obtaining an estimate of the true proportion of correct classifications,
he can use $\Phi(D/2)$, where $\Phi$ is the standard normal distribution
function and $D^2$ is the reported Mahalanobis distance measure. [Here
"function" is used in the mathematical sense.] This will yield an estimate
that tends to be somewhat high.
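
The counting rule and the $\Phi(D/2)$ estimate can be illustrated with a short sketch. The group means, case values, and $D^2$ below are invented for the illustration, and the tie-breaking rule (ties go to Group 1) is an assumption, not something stated in the program documentation.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def assign(score, mean1, mean2):
    """Assign a case to the group whose mean function value is closer."""
    return 1 if abs(score - mean1) <= abs(score - mean2) else 2


# Illustrative discriminant function means and (value, actual group) pairs.
mean1, mean2 = 0.60, -0.30
cases = [(0.9, 1), (0.2, 1), (0.1, 1), (-0.5, 2), (0.3, 2), (-0.2, 2)]

hits = sum(1 for score, group in cases if assign(score, mean1, mean2) == group)
sample_hit_rate = hits / len(cases)

d_squared = 2.25  # stand-in for the reported Mahalanobis D^2
estimated_hit_rate = normal_cdf(math.sqrt(d_squared) / 2.0)
```

In this made-up example four of the six cases are correctly classified by the count, while the normal-theory estimate is about 0.77.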

The analysis yielded by the 4M program may be repeated using any
number of specified subsets of the original predictor variables. If the
user wants to discover what the results would be if one or more variables
were deleted, the Selection Card is used.

Stepwise Regression 
If the research situation is such that only two criterion groups
are involved, as when the 4M program would be used, it might be well to
consider the 2R program. The formal equivalence of two-group discriminant
analysis and multiple regression analysis is well known. That is, the
regression coefficients obtained for the two-group situation, when the
dependent variable is group membership, are proportional to the discriminant
function coefficients. This statement holds when, for both analyses,
the coefficients considered are those applicable to raw scores. With a
regression analysis, measures on the dependent variable are often taken
to be 0 for all members of one group and 1 for all members of the other group.
Coefficients ($b_i$'s) comparable to the discriminant function coefficients
are outputs of the 2R program; if care is taken in specifying the
"F-level for inclusion," coefficients may be obtained for all p original
variables. If desired, coefficients applicable to standardized scores
($b_i^*$'s) may be obtained by multiplying each reported coefficient by the
product of $(n_1 + n_2)/\sqrt{n_1 n_2}$ and the reported standard deviation
($s_i$) of the variable in question; i.e.,

$$b_i^* = \frac{b_i s_i (n_1 + n_2)}{\sqrt{n_1 n_2}}.$$

If $n_1 = n_2$ we have $b_i^* = 2 b_i s_i$.
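
A brief sketch of this rescaling follows, using invented coefficients and standard deviations in place of 2R output; the function name is hypothetical.

```python
import math


def standardized_regression_weights(b, s, n1, n2):
    """Scale raw-score weights by s_i (n1 + n2) / sqrt(n1 n2)."""
    factor = (n1 + n2) / math.sqrt(n1 * n2)
    return [b_i * s_i * factor for b_i, s_i in zip(b, s)]


# Illustrative raw-score coefficients and predictor standard deviations.
b = [0.05, -0.02]
s = [4.0, 5.0]

equal_groups = standardized_regression_weights(b, s, 10, 10)   # factor is 2
unequal_groups = standardized_regression_weights(b, s, 16, 4)  # factor is 2.5
```

The factor equals 2 only when the groups are of equal size; with 16 and 4 cases it rises to 2.5, so the shortcut $2 b_i s_i$ should not be used for unequal groups.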

Additional information may also be obtained from the 2R program
(not using the zero regression intercept option) which may be useful
in interpreting the separation between the two groups. First, regression
equations consisting of different numbers of variables are determined
in a stepwise manner. Thus an ordering of the variables in terms of their
contribution to improved prediction is available. Subsets of variables
may thus be selected, recognizing, of course, that a subset so selected
may not be the best one of that particular size. Secondly, an ordering
of the predictors according to discriminant function versus predictor
correlations or, equivalently, to univariate F-ratios (or, in this case,
absolute values of the univariate t-ratios) is possible. The F-value
for the ith predictor is determined by



$$F_i = \frac{r_i^2 \, (n_1 + n_2 - 2)}{1 - r_i^2}$$

where $r_i$ is the point-biserial correlation between the ith predictor and
the dependent (grouping) variable. The $r_i$-values are reported in the
(optional) output "CORRELATION MATRIX." If the composite versus ith
predictor correlation coefficient is of interest, it may easily be
obtained as

$$\frac{r_i}{R}$$

where R is the multiple correlation coefficient based on all of the
predictors (see Cooley and Lohnes, 1971, p. 55 or Mulaik, 1972, p. 404).
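
Both quantities are simple to compute once the point-biserial correlations and R are read from the printout. The sketch below uses invented correlations and group sizes; the function names are hypothetical.

```python
def f_from_point_biserial(r, n1, n2):
    """Univariate F for a predictor from its point-biserial correlation."""
    return r ** 2 * (n1 + n2 - 2) / (1.0 - r ** 2)


def structure_correlation(r, multiple_r):
    """Correlation between a predictor and the composite: r_i / R."""
    return r / multiple_r


# Illustrative values: three predictors, 15 cases per group.
r_values = [0.50, 0.30, -0.10]
multiple_r = 0.60
n1, n2 = 15, 15

f_values = [f_from_point_biserial(r, n1, n2) for r in r_values]
structure = [structure_correlation(r, multiple_r) for r in r_values]

# Orderings by F and by absolute structure correlation.
order_by_f = sorted(range(3), key=lambda i: -f_values[i])
order_by_structure = sorted(range(3), key=lambda i: -abs(structure[i]))
```

Because both quantities are increasing functions of the absolute value of $r_i$, the two orderings coincide, which is the equivalence stated in the text.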

A measure of the distance between the two centroids may also be
obtained from the 2R output. The value of $D^2$ is given by the
relationship (see Porebski, 1966),

$$D^2 = \frac{(n_1 + n_2)(n_1 + n_2 - 2)}{n_1 n_2} \cdot \frac{R^2}{1 - R^2}.$$
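
As a worked illustration of this relationship, the sketch below converts a squared multiple correlation into $D^2$. The input values are invented, and the function name is hypothetical.

```python
def d_squared_from_r_squared(r_squared, n1, n2):
    """Mahalanobis D^2 between two centroids from the regression R^2."""
    n = n1 + n2
    return (n * (n - 2) / (n1 * n2)) * r_squared / (1.0 - r_squared)


# Illustrative values: R = 0.60 with 15 cases in each group.
d_squared = d_squared_from_r_squared(0.36, 15, 15)
```

With $R^2 = 0.36$ and 15 cases per group, the first factor is 840/225 and the second is 0.5625, giving $D^2 = 2.1$.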

When the number of cases in each of the two groups is the same,
output from the 2R program may also be useful for the purpose of
classification. By assigning a 1 to cases in Group 1 and 0 to cases in Group 2
for scores on the dependent variable, classification results identical to
those from the 4M program may be obtained by merely requesting the list
of residuals to be printed. [The Subproblem Card must be set up so that
all of the predictors are eventually included in the regression equation.]
The proportion of correct classifications is found by counting the
residuals closer to 1 for cases in Group 1 and residuals closer to 0 for
cases in Group 2. As with the discriminant function values reported with
the output from the 4M program, having the residuals from the 2R program
enables the user to make interpretations regarding the misclassifications of
particular cases.



Discriminant Analysis for Several Groups

The output from the 5M program consists of the basic statistics
plus a generalized Mahalanobis $D^2$ value with an associated chi-square
value, classification function coefficients and constants, posterior
probabilities of group membership for each case, and a classification
summary table. It should be noted that the generalized $D^2$ measure is not
the same as the measure yielded by the 4M program; it is what Rao
(1952, p. 257) denotes as his V-statistic. It turns out that in the
two-group situation, $V = D^2 n_1 n_2 / (n_1 + n_2)$. This statistic may be
used as an alternative to Wilks' lambda statistic and, in the two-group
situation, to Hotelling's $T^2$ statistic. It is appropriate at this point
to discuss the resultant classification "functions." These are not the same
as the usual discriminant functions (Cooley and Lohnes, 1971, p. 246). Rather,
they are a modification of the "linear discriminant scores" discussed
in Rao (1965, p. 438). The derivation of these functions is based on
assumptions of multivariate normality and common covariance matrices.
These functions do not take into account possible unequal prior
probabilities of group membership, whereas Rao's do. In the two-group
situation the differences of the corresponding coefficients obtained from
the 5M program are proportional to the coefficients yielded by the 4M program
(Rao, 1965, p. 489).

No information is printed which might aid the user in studying
group separation. In the general g-group situation it is not possible
to determine relative variable contribution nor dimensions of separation.
It is not appropriate to rank-order the variables by examining the
printed coefficients.

The 5M program is used basically for the purpose of classification.
This classification analysis actually amounts to a "reclassification,"
in that each case is assigned to a population depending upon its function
value, which is based on the conglomerate of cases being assigned. That
is, there are no means of classifying a "new" case into one of the
predetermined categories. The classifications are determined by associated
posterior probability (of group membership) values; this is equivalent to
basing the classifications on the largest function value obtained for
each case. Potentially different prior probabilities of group membership
are not considered.
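
The equivalence between the two assignment rules can be seen in a small sketch. It assumes the usual normal-theory form, in which, with equal priors, the posterior probability for group j is proportional to the exponential of the jth classification function value; the three function values are invented for one hypothetical case.

```python
import math


def posteriors_equal_priors(function_values):
    """Posterior probabilities proportional to exp(function value)."""
    top = max(function_values)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in function_values]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative classification function values for one case, g = 3 groups.
values = [12.4, 14.1, 9.8]
post = posteriors_equal_priors(values)

group_by_posterior = post.index(max(post)) + 1
group_by_function = values.index(max(values)) + 1
```

Since the exponential is monotone, the group with the largest function value always has the largest posterior probability under this assumption.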

Stepwise Discriminant Analysis 
The last BMD program to be reviewed is 7M, Stepwise Discriminant
Analysis. Of the discriminant analysis programs used in the reported
literature the 7M program is probably referenced most often. Its
widespread use might be attributed to the abundant amount of information
yielded. Besides group means and standard deviations, within-groups
covariance and correlation matrices are printed. At each step in the
analysis various statistics are reported; a summary table is also
printed, and plots of canonical (actually linear discriminant function)
deviation scores are optional.

The "classification functions" computed in the 7M program are the
same as those in the 5M program. [The constant terms yielded by the 7M
program differ from the 5M constants, and are slightly in error.] It
should be noted that the discriminant function coefficients based on the
eigenanalysis of $W^{-1}B$ (assuming equal covariance matrices) are printed,
along with the eigenvalues, following the summary table; in the printout
they are labeled "COEFFICIENTS FOR CANONICAL VARIABLE." [If p > g,
only the first (g-1) sets of coefficients need be examined.] These
coefficients may be scaled so that they are applicable to standardized
scores by multiplying each coefficient by the (positive) square root
of the product of (n-g) and the corresponding diagonal element of the
printed "WITHIN GROUPS COVARIANCE MATRIX," where n is the total number
of cases.

Considerable information is presented which may be used for the
purpose of discrimination. First of all, the number of significant
dimensions of separation may be determined by subjecting the reported
eigenvalues to a significance test (see [reference illegible], p. 165, and
Harris, 1974). Following this the user can examine the plots of the
discriminant scores to ascertain which groups are differentiated by
which (significant) discriminant function. On the Group Label Card(s),
different first letters for the g labels ought to be used. Only
two-dimensional plots are given, but typically two functions account for
almost all group separation. [It would aid in the interpretation of
the functions if the variable-function correlations were available. No
correlations are printed; however, correlations based on the total-group
correlation matrix are obtainable through the use of the 3D program,
Correlation with Item Deletion. This would require the writing of a
few FORTRAN statements to obtain the linear composites of the variables
determined by the discriminant (not classification) function coefficients;
this might be simpler than using transgeneration cards. These correlations
may be used for interpretation as "structure coefficients" (Cooley and
Lohnes, 1971, p. 248).] In addition to the scaled coefficients and the
correlations, a third means of interpretation may be used. This is an
assessment of variable contribution to group separation provided by the
ordering of variables entered into the analysis in a stepwise manner.
[As might be expected, in the two-group situation the 2R and 7M programs
yield identical orderings.] Further, the univariate F-statistics may
be determined from the reported means and standard deviations (Gordon,
1973), or by using the means and the diagonal elements of the within-groups
covariance matrix.

At each step statistics are reported which determine whether or
not the variables entered significantly separate the criterion
populations (in a mean vector sense). In addition, a matrix of F-values
is given, each F-value being a transformation of a distance measure
(Mahalanobis' $D^2$) between pairs of groups (Dixon, 1973, p. 241). The
inverse of this transformation would yield distance measures which may
be helpful in characterizing group differences. If, for example,
distances between all pairs of g-1 of the groups are small, yet at the
same time, the jth group is distinctly separated from the other g-1
groups, it is clear that the only differentiation taking place occurs
between the jth group and its complement, i.e., the other g-1 groups.
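
The inversion might be carried out along the following lines. The exact transformation used by the 7M program is not given in this paper, so the sketch assumes the standard Hotelling-type relation for a pooled within-groups matrix, F = [(N - g - p + 1) / (p(N - g))] [n_i n_j / (n_i + n_j)] D^2, with N the total number of cases and p the number of variables entered; this form should be checked against the program manual before use. All numbers and function names are hypothetical.

```python
def pairwise_f_from_d_squared(d_squared, n_i, n_j, n_total, g, p):
    """Assumed forward transformation from D^2 to the pairwise F-value."""
    t_squared = d_squared * n_i * n_j / (n_i + n_j)
    return t_squared * (n_total - g - p + 1) / (p * (n_total - g))


def d_squared_from_pairwise_f(f_value, n_i, n_j, n_total, g, p):
    """Inverse of the assumed transformation: recover D^2 from F."""
    t_squared = f_value * p * (n_total - g) / (n_total - g - p + 1)
    return t_squared * (n_i + n_j) / (n_i * n_j)


# Illustrative case: groups of 10 and 15 among N = 40 cases, g = 3, p = 4.
f_value = pairwise_f_from_d_squared(2.0, 10, 15, 40, 3, 4)
recovered = d_squared_from_pairwise_f(f_value, 10, 15, 40, 3, 4)
```

Applying the inverse to every entry of the printed F matrix would give a matrix of pairwise distances that can be inspected for the pattern described above.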

As with many other discriminant analysis programs, including
4M and 5M, classification with the 7M program is usually carried out
on the cases on which the classification statistics are based. Although
results of classifying "new" cases would be more generalizable, results
of the usual classification do provide descriptive information in that the
total discriminatory power of the set of predictors may be assessed via
the proportion of correct classifications. It is possible, however,
with the 7M program to classify a group of cases which were not considered
in determining the classification statistics. This is simply done by
preceding that group size by a minus sign on the Sample-Size Card.

The classification procedure in the 7M program has the restriction of
assuming equal covariance matrices (in that W is used) in determining
the classification functions. However, it is different from the procedure
in the 5M program in that it incorporates prior probabilities of group
membership in computing posterior probabilities. The prior probabilities
to be used may be specified on the Problem Card; the g priors most often
used are given by the ratios of the group sizes to the total number of
cases. Results of the classifications are given at each step in the
analysis as well as after the final step.
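
The effect of the priors can be illustrated as follows. The sketch assumes the usual normal-theory form in which the posterior probability for group j is proportional to the prior times the exponential of the jth classification function value; the group sizes and function values are invented, and the function name is hypothetical.

```python
import math


def posteriors_with_priors(function_values, priors):
    """Posterior probabilities proportional to prior * exp(function value)."""
    adjusted = [v + math.log(q) for v, q in zip(function_values, priors)]
    top = max(adjusted)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in adjusted]
    total = sum(weights)
    return [w / total for w in weights]


# Priors taken as ratios of group sizes to the total number of cases.
group_sizes = [70, 20, 10]
size_priors = [n / sum(group_sizes) for n in group_sizes]

values = [12.4, 13.0, 9.8]  # illustrative function values for one case
with_equal = posteriors_with_priors(values, [1.0 / 3.0] * 3)
with_sizes = posteriors_with_priors(values, size_priors)

group_equal = with_equal.index(max(with_equal)) + 1
group_sizes_based = with_sizes.index(max(with_sizes)) + 1
```

In this made-up case the assignment changes from Group 2 under equal priors to Group 1 under size-based priors, which shows why the choice of priors on the Problem Card deserves attention.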

Summary and Recommendations

Two purposes of a "discriminant analysis" are reviewed: those of
discrimination and classification. The former pertains to a study of
criterion group separation with respect to predictor variable
contribution and dimensions of separation, while the latter involves the
assignment of cases (individuals or objects) to criterion populations.
The usual requisite conditions of normality, homogeneity of dispersion,
and equal costs of misclassification are discussed. The primary purpose of
this paper was that of reviewing a set of computer programs designed to
carry out a "discriminant analysis" in light of purposes and requisite
conditions. Interpretation of the outputs from these programs is covered,
along with similarities and differences across program outputs.

When using the BMD discriminant analysis programs it is recommended
that multiple analyses be made, reanalyzing data with the same program
and, when appropriate, with different programs. The programs may be
used more than once by varying some of the options available; for
example, using different variable subsets in the 4M program, or using
different F-levels or variable selection criteria in the 7M program. Running
analyses on a given set of data using different BMD programs is also
helpful; for example, obtaining outputs from 2R, 4M, and 7M on the same
data. It should be noted that such multiple runs, using the same or
different programs, on the same data may be accomplished by a single
submission of the data to a computer center.

Three further recommendations may be made when using the BMD programs.
One is to use appropriate prior probabilities in the 7M program. Unless
results from past research on similar variables are available, and unless
other theoretical considerations can be used to assess prior probabilities
of group membership, it is well to use priors of $n_j / \sum n_j$. Another
recommendation pertains to estimation of proportions of correct
classifications or of misclassifications. If the number of cases to be
classified is large enough, it would be well to use the validation procedure
afforded by the 7M program to classify new cases (see, however, Horst, 1966,
pp. 139-140). To do this one can use what is called a "holdout sample."
A third recommendation is to examine multi-univariate analyses to screen
data prior to using, say, the 7M program (see Huberty, 1974).

The BMD programs yield information which may be used in subsequent
calculations to determine statistics for more complete interpretation.
For example, discriminant coefficients applicable to standardized scores
may be determined from output of both the 4M and 7M programs, as well as
from the 2R output in the two-group situation. Univariate F-values are
also obtainable from the 4M and 7M output, as are correlations between
predictors and discriminant functions. These three statistics, plus
the ordering of variables entered as determined by the 2R and 7M
programs, can be examined in assessing variable contribution
to separation and in interpreting the discriminant functions (see
Huberty, 1971; Tatsuoka, 1973, p. 280).

Depending upon the purpose of a study and resources available, the
researcher might do well to use other computer programs in lieu of,
or in addition to, the BMD program(s) selected. In this way other
statistics may be examined, e.g., test statistics and classification
statistics. In particular, it is advised that programs using quadratic
classification functions which do not require equal covariance matrices
be selected when the data are such that linear functions are inappropriate.
It is of interest to note that a new BMD program is now available; this
program requires some special hardware, and may be obtained for a small
cost. This new program, which is discussed by Dixon and Jennrich (1973),
has three very promising added features: provision for (1) more
meaningful graphic interpretation of results, (2) the handling of the
unequal covariance structure problem, and (3) specifying relative costs
of misclassification as well as prior probabilities for each group.



FOOTNOTE 



It ought to be noted that this use of the term "function" is not
mathematically correct. However, tradition will be followed in this
paper by using the term to mean a linear composite.